Lab 9 Challenge

Author

Sebastian Boksem

Libraries:

Importing Data Set:

statesnames_a <- read.csv(here::here("supporting_artifacts", "Lab9Challenge", "StateNames_A.csv"))
statesnames_a <- statesnames_a |>
  rename("Sex" = "Gender"
         )
statesnames_a |>
  datatable(filter = 'bottom', options = list(pageLength = 100)) |>
  formatStyle(c('Name', 'Year', 'Sex', 'State', 'Count'),  color = 'red', backgroundColor = 'salmon', fontWeight = 'bold')
Warning in instance$preRenderHook(instance): It seems your data is too big
for client-side DataTables. You may consider server-side processing: https://
rstudio.github.io/DT/server.html

Summarizing and Visualizing:

3.1 (Also “spicing” up this one)

statesnames_a |> 
  pivot_wider(names_from = Sex, values_from = Count) |>
  filter(Name == 'Allison') |>
mutate(
    across(c('F', 'M'), ~replace_na(.x, 0))) |>
  group_by(State) |>
  summarize(across(F:M, sum)) |>
  rename("Number of Male-assigned Babies" = "M",
         "Number of Female-assigned Babies" = "F") |>
  datatable(
  head(statesnames_a),
  caption = htmltools::tags$caption(
    style = 'caption-side: bottom; text-align: center;',
    'Table 1: ', htmltools::em('The Number of babies are mostly Female-assigned ') 
  )
) |>   formatStyle('State',  color = 'blue', backgroundColor = 'green', fontWeight = 'bold')

3.2

statesnames_a |> 
  filter(Sex == 'F') |>
  pivot_wider(names_from = Sex, values_from = Count) |>
  filter(Name == 'Allison')  |>
  group_by(State) |>
  summarize(sum(F)) |>
    rename(
         "Number of Female-assigned Babies" = "sum(F)") |>
  kable()
State Number of Female-assigned Babies
AK 232
AL 1535
AR 1198
AZ 1880
CA 12413
CO 1594
CT 1099
DC 321
DE 294
FL 4455
GA 3257
HI 183
IA 1477
ID 451
IL 5110
IN 3067
KS 1283
KY 1905
LA 1209
MA 2218
MD 2229
ME 340
MI 4014
MN 2374
MO 2882
MS 817
MT 226
NC 3435
ND 285
NE 807
NH 412
NJ 3052
NM 399
NV 729
NY 5747
OH 5487
OK 1421
OR 1186
PA 4307
RI 306
SC 1228
SD 376
TN 2488
TX 10192
UT 1125
VA 3220
VT 135
WA 1956
WI 2367
WV 813
WY 142

Pennsylvania

Values Challenge:

statesnames_a |> 
  filter(Name == 'Allan' | Name ==  'Alan' | Name == 'Allen',
         State == 'PA' | State == 'CA',
         Year == 2000) |>
  group_by(Name, State) |>
  summarize(sum(Count)) |>
  rename("Count" = `sum(Count)`) |>
  pivot_wider(names_from = Name, values_from = Count) |>
    rename("# of Alan" = "Alan",
         "# of Allan" = "Allan",
         "# of Allen" =  "Allen") |>
  kable()
`summarise()` has grouped output by 'Name'. You can override using the
`.groups` argument.
State # of Alan # of Allan # of Allen
CA 584 131 176
PA 51 12 56

Percents Challenge:

statesnames_a |> 
  filter(Name == 'Allan' | Name ==  'Alan' | Name == 'Allen',
         State == 'PA' | State == 'CA',
         Year == 2000) |>
  group_by(Name, State) |>
  summarize(sum(Count)) |>
  rename("Count" = `sum(Count)`) |>
  pivot_wider(names_from = Name, values_from = Count) |>
  mutate(Total = Alan + Allan + Allen,
         Alan = Alan / Total,
         Allan = Allan / Total,
         Allen = Allen / Total) |>
      rename("# of Alan" = "Alan",
         "# of Allan" = "Allan",
         "# of Allen" =  "Allen") |>
  select(-Total) |>
  kable()
`summarise()` has grouped output by 'Name'. You can override using the
`.groups` argument.
State # of Alan # of Allan # of Allen
CA 0.6554433 0.1470258 0.1975309
PA 0.4285714 0.1008403 0.4705882